Recognition of Slovenian speech: within and cross-language experiments on monophones using the SpeechDat(II)
Abstract
Although the Slovenian SpeechDat(II) database is the largest spoken language resource ever recorded for Slovenian, it is among the smaller speech data collections made available by the European LE2-4001 project (http://www.speechdat.org/). The aim of this paper is to analyze this new Slovenian resource and to explore the possibilities of supplementing it with data recorded for other languages. The donor languages considered are English, German, and Danish. For each of these languages, four times as much speech data has been recorded (4000 speakers, compared to the 1000 speakers of the Slovenian database). Our purely data-driven cross-language tests show that porting data across languages involves serious problems. Some of these problems are due to differences in the recording conditions (telephone line noise); others can be explained by the different phonological structures of the analyzed languages.
Similar resources
Graphemes as basic units for crosslingual speech recognition
This paper presents our work on grapheme-based crosslingual speech recognition carried out within the MASPER initiative. The performance of monolingual grapheme-based acoustic models is compared to the performance of monolingual acoustic models based on phonemes. The transfer between source and target language was done using an expert knowledge approach. For the experiments, German, Spanish, Hu...
The clustering algorithm for the definition of multilingual set of context dependent speech models
The paper addresses the problem of designing a language-independent phonetic inventory for speech recognisers with a multilingual vocabulary. A new clustering algorithm for the definition of a multilingual set of triphones is proposed. The clustering algorithm is based on a distance measure for triphones defined as a weighted sum of explicit estimates of the context similarity on a...
Locus equations determination using the SpeechDat(II)
This paper presents a corpus-based approach to the determination of locus equations for the Slovenian language. The SpeechDat(II) spoken language database is first analyzed for all available target VCV contexts in order to yield candidate subsets for the acoustic-phonetic measurements. Only the VCVs embedded within judiciously chosen carrier utterances are then selected for the (F2 vowel, F2 onset) mea...
Preliminary Evaluation of Slovenian Mobile Database PoliDat
This paper describes the preliminary speech recognition evaluation of the PoliDat database. This new database contains Slovenian speech captured over mobile telephones. The design of the database follows the SpeechDat(II) specifications. The recording of the speech material and the format of the database are briefly described. The speech recognition experiment is based on slightly ...
The Development and Integration of the LDA-Toolkit Into COST249 SpeechDat(II) SIG Reference Recognizer
This paper presents the development of a Linear Discriminant Analysis toolkit (LDA-Toolkit) and its integration into the widely used COST249 SpeechDat(II) Task Force Reference Recognizer (RefRec). The crucial parts of the LDA, the determination of LDA classes, as well as the influence of the level of dimensionality reduction on automatic speech recognition performance, are discussed. Evaluation of pr...